Measuring the Usefulness of Function Words for Authorship Attribution
Abstract
Some forty years ago, Mosteller and Wallace suggested in their influential work on the Federalist Papers that a small number of the most frequent words in a language ('function words') could usefully serve as indicators of authorial style. The decades since have seen this work taken up in many ways, including both the use of new analysis techniques (discriminant analysis, PCA, neural networks, and more) and the search for more sophisticated features by which to capture stylistic properties of texts. Interestingly, while the use of more sophisticated models and algorithms has often led to more reliable and generally applicable results, it has proven quite difficult to improve on the general usefulness of function words for stylistic attribution. Indeed, John F. Burrows, in his seminal work on Jane Austen, has demonstrated that function words can be used quite effectively for attributing text passages to different authors, novels, or individual characters.
Similar resources
Function Words for Chinese Authorship Attribution
This study explores the use of function words for authorship attribution in modern Chinese (C-FWAA). This study consists of three tasks: (1) examine the C-FWAA effectiveness in three genres: novel, essay, and blog; (2) compare the strength of function words as both genre and authorship indicators, and explore the genre interference on C-FWAA; (3) examine whether C-FWAA is sensitive to the time ...
Function Words in Authorship Attribution. From Black Magic to Theory?
This position paper focuses on the use of function words in computational authorship attribution. Although recently there have been multiple successful applications of authorship attribution, the field is not particularly good at the explication of methods and theoretical issues, which might eventually compromise the acceptance of new research results in the traditional humanities community. I ...
Style-Markers in Authorship Attribution: A Cross-Language Study of the Authorial Fingerprint
The present study addresses one of the theoretical problems of computer-assisted authorship attribution, namely the question of which traceable features of language can betray authorial uniqueness (a stylistic fingerprint) of literary texts. A number of recent approaches show that apart from lexical measures — especially those relying on the frequencies of the most frequent words — also some oth...
Towards a better understanding of Burrows's Delta in literary authorship attribution
Burrows’s Delta is the most established measure for stylometric difference in literary authorship attribution. Several improvements on the original Delta have been proposed. However, a recent empirical study showed that none of the proposed variants constitute a major improvement in terms of authorship attribution performance. With this paper, we try to improve our understanding of how and why ...
LitLin 19_4 453-475 fqh034 FIN
Delta, a simple measure of the difference between two texts, has been proposed by John F. Burrows as a tool in authorship attribution problems, particularly in large ‘open’ problems in which conventional methods of attribution are not able to limit the claimants effectively. This paper tests Delta’s effectiveness and accuracy, and shows that it works nearly as well on prose as it does on poetry...